Putting OAC-triclustering on MapReduce

Authors

  • Sergey Zudin
  • Dmitry Gnatyshak
  • Dmitry I. Ignatov
Abstract

In our previous work, an efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) was proposed. This algorithm is a modified version of the basic algorithm for the OAC-triclustering approach; it has linear time and memory complexities. In this paper we parallelise it via the map-reduce framework to make it suitable for big datasets. The results of computer experiments show the efficiency of the proposed algorithm; for example, it outperforms its online counterpart on the Bibsonomy dataset with ≈ 800,000 triples.
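The idea in the abstract can be illustrated with a minimal single-process sketch. It assumes the prime-operator variant of OAC-triclustering, in which each triple (g, m, b) generates the tricluster ((m, b)', (g, b)', (g, m)'), and it imitates the map and reduce phases with plain Python functions. The function names and the key labels "extent", "intent", and "modus" are hypothetical choices for this illustration; this is not the authors' implementation and does not use any real MapReduce API.

```python
from collections import defaultdict


def map_triple(triple):
    """Map phase (assumed design): emit one key-value pair per prime set."""
    g, m, b = triple
    yield ("extent", m, b), g   # contributes g to (m, b)'
    yield ("intent", g, b), m   # contributes m to (g, b)'
    yield ("modus", g, m), b    # contributes b to (g, m)'


def reduce_by_key(pairs):
    """Reduce phase: collect all values sharing a key into one frozen set."""
    groups = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return {key: frozenset(values) for key, values in groups.items()}


def triclusters(triples):
    """Assemble one tricluster per triple and drop duplicates.

    `triples` must be a list (it is traversed twice).
    """
    primes = reduce_by_key(kv for t in triples for kv in map_triple(t))
    result = set()
    for g, m, b in triples:
        result.add((
            primes[("extent", m, b)],
            primes[("intent", g, b)],
            primes[("modus", g, m)],
        ))
    return result


if __name__ == "__main__":
    # Toy (user, tag, resource) triples, invented for illustration only.
    data = [("u1", "t1", "r1"), ("u2", "t1", "r1"), ("u1", "t2", "r1")]
    for extent, intent, modus in triclusters(data):
        print(sorted(extent), sorted(intent), sorted(modus))
```

Each triple is read once in the map step and once in the assembly step, which is in line with the linear complexity mentioned in the abstract; in a real cluster setting the dictionary lookup would be replaced by a second grouping stage.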

Similar resources

From Triadic FCA to Triclustering: Experimental Comparison of Some Triclustering Algorithms

In this paper we show the results of an experimental comparison of five triclustering algorithms on real-world and synthetic data with respect to resource efficiency and four quality measures. One of the algorithms, OAC-triclustering based on prime operators, is presented for the first time in this paper. An interpretation of the results for the real-world datasets is provided.

A One-pass Triclustering Approach: Is There any Room for Big Data?

An efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) is proposed. This algorithm is a modified version of the basic algorithm for the OAC-triclustering approach, but it has linear time and memory complexities with respect to the cardinality of the underlying ternary relation and can be easily parallelized in order to be applied for the analysis of big da...

BSP vs MapReduce

The MapReduce framework has been generating a lot of interest in a wide range of areas. It has been widely adopted in industry and has been used to solve a number of non-trivial problems in academia. Putting MapReduce on strong theoretical foundations is crucial in understanding its capabilities. This work links MapReduce to the BSP model of computation, underlining the relevance of BSP to mode...

Visual Analytics in FCA-based Clustering

Visual analytics is a subdomain of data analysis which combines both human and machine analytical abilities and is applied mostly in decision-making and data mining tasks. Triclustering, based on Formal Concept Analysis (FCA), was developed to detect groups of objects with similar properties under similar conditions. It is used in Social Network Analysis (SNA) and is a basis for certain types o...

Sorting, Searching, and Simulation in the MapReduce Framework

In this paper, we study the MapReduce framework from an algorithmic standpoint and demonstrate the usefulness of our approach by designing and analyzing efficient MapReduce algorithms for fundamental sorting, searching, and simulation problems. This study is motivated by a goal of ultimately putting the MapReduce framework on an equal theoretical footing with the well-known PRAM and BSP paralle...

Publication date: 2015